Learning Mixtures of Bernoulli Templates by Two-Round EM with Performance Guarantee

Authors

  • Adrian Barbu
  • Tianfu Wu
  • Ying Nian Wu
Abstract

Dasgupta and Schulman [1] showed that a two-round variant of the EM algorithm can learn mixtures of Gaussian distributions with near-optimal precision with high probability, provided the Gaussian distributions are well separated and the dimension is sufficiently high. In this paper, we generalize their theory to learning mixtures of high-dimensional Bernoulli templates. Each template is a binary vector, and a template generates examples by switching each of its binary components independently at random with a certain probability. In computer vision applications, a binary vector is a feature map of an image, where each binary component indicates whether a local feature or structure is present or absent within a certain cell of the image domain. A Bernoulli template can be considered a statistical model for images of objects (or parts of objects) from the same category. We show that the two-round EM algorithm can learn mixtures of Bernoulli templates with near-optimal precision with high probability, provided the Bernoulli templates are sufficiently different and the number of features is sufficiently high. We illustrate the theoretical results with synthetic and real examples.
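
The generative model described in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the function name `sample_mixture`, the single shared flip probability `flip_prob`, and the mixing `weights` are assumptions introduced here, and the two-round EM procedure itself is not reproduced.

```python
import numpy as np

def sample_mixture(templates, weights, flip_prob, n_samples, rng):
    """Draw binary examples from a mixture of Bernoulli templates.

    Assumptions (not taken from the paper): one flip probability shared
    by all templates and components, and mixing weights that sum to 1.
    """
    templates = np.asarray(templates, dtype=np.uint8)
    # Pick which template generates each example.
    labels = rng.choice(len(templates), size=n_samples, p=weights)
    # flips[i, j] is True when component j of example i is switched.
    flips = rng.random((n_samples, templates.shape[1])) < flip_prob
    # XOR with the flip mask switches exactly the selected components.
    examples = np.bitwise_xor(templates[labels], flips.astype(np.uint8))
    return examples, labels

# Usage: three random templates with 200 binary features each.
rng = np.random.default_rng(0)
templates = rng.integers(0, 2, size=(3, 200), dtype=np.uint8)
examples, labels = sample_mixture(templates, [0.5, 0.3, 0.2], 0.1, 1000, rng)
```

With a small flip probability and many features, examples from the same template stay close to it in Hamming distance, which is the intuition behind the separation condition stated in the abstract.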

Similar resources

Mapping Energy Landscapes of Non-Convex Learning Problems

In many statistical learning problems, the target functions to be optimized are highly non-convex in various model spaces and thus are difficult to analyze. In this paper, we compute Energy Landscape Maps (ELMs) which characterize and visualize an energy function with a tree structure, in which each leaf node represents a local minimum and each non-leaf node represents the barrier between adjac...

EM Initialisation for Bernoulli Mixture Learning

Mixture modelling is a hot area in pattern recognition. This paper focuses on the use of Bernoulli mixtures for binary data and, in particular, for binary images. More specifically, six EM initialisation techniques are described and empirically compared on a classification task of handwritten Indian digits. Somewhat surprisingly, we have found that a relatively good initialisation for Bernoulli ...

A Probabilistic Analysis of EM for Mixtures of Separated, Spherical Gaussians

We show that, given data from a mixture of k well-separated spherical Gaussians in R^d, a simple two-round variant of EM will, with high probability, learn the parameters of the Gaussians to near-optimal precision, if the dimension is high (d ≫ ln k). We relate this to previous theoretical and empirical work on the EM algorithm.

Multivariate Structural Bernoulli Mixtures for Recognition of Handwritten Numerals

As shown recently, the structural optimization of probabilistic neural networks can be included into EM algorithm by introducing a special type of multivariate Bernoulli mixtures. However, the underlying loglikelihood criterion is known to be multimodal in case of mixtures and therefore the EM iteration process may be starting-point dependent. In the present paper we discuss the possibility of ...

A two-sided Bernoulli-based CUSUM control chart with autocorrelated observations

Usually, in monitoring a proportion p, the binary observations are considered independent; however, in many real cases, there is a continuous stream of autocorrelated binary observations in which a two-state Markov chain model is applied with first-order dependence. On the other hand, the Bernoulli CUSUM control chart which is not robust to autocorrelation can be applied two-sided co...

Publication date: 2013